(19) Europäisches Patentamt
European Patent Office
Office européen des brevets




(11) EP 0 718 595 A2

(12) EUROPEAN PATENT APPLICATION



(43) Date of publication: 

26.06.1996 Bulletin 1996/26 

(21) Application number: 95120201.9 

(22) Date of filing: 20.12.1995 



(51) Int. Cl.6: G01B 11/06



(84) Designated Contracting States: 
DE FR GB NL 

(30) Priority: 21.12.1994 US 360535 

(71) Applicant: Hughes Aircraft Company 

Los Angeles, California 90045-0066 (US) 



(72) Inventor: Ledger, Anthony M. 

New Fairfield, Connecticut 06812 (US)

(74) Representative: Witte, Alexander, Dr.-Ing. et al
Witte, Weller, Gahlert, Otten & Steil,
Patentanwälte,
Rotebühlstrasse 121
D-70178 Stuttgart (DE)



(54) Automatic rejection of diffraction effects in thin film metrology 



(57) A layer thickness determination system (10) is 
employed for detecting a thickness of at least one layer 
(12a) disposed over a surface of a wafer (13) having 
one or more first regions characterized by circuit and 
other features, and one or more second regions char- 
acterized by an absence of circuit and other features. 
The determination system (10) includes an optical sys- 
tem (14) for collecting light reflecting from the at least 
one layer (12a) and the surface of the wafer (13). The 
optical system (14) is preferably a telecentric optical
system, and has an optical axis (OA; 14a) and a narrow
cone of acceptance angles disposed about the optical
axis (OA; 14a). The determination system (10) further
includes a camera (16) coupled to the optical system
(14) for obtaining an image from the collected light; a
first light source (22) for illuminating the layer (12a) with
light that is directed along the optical axis (OA; 14a) and
within the cone of acceptance angles; and at least one
second light source (18) for illuminating the layer (12a)
with light that is directed off the optical axis (OA; 14a)
and outside of the cone of acceptance angles. A data
processor has an input coupled to an output of the cam- 
era (16) for obtaining from the camera (16) first pixel 
data corresponding to at least one first image obtained 
with light from the first light source (22) and for obtaining 
second pixel data corresponding to at least one second 
image obtained with light from the at least one second 
light source (18). The data processor operates to gen- 
erate an image mask from the second pixel data for dis- 
tinguishing the first wafer regions from the second wafer 
regions, and further operates to detect a thickness of 
the at least one layer (12a) within the second regions in 
accordance with the first pixel data and in accordance 
with predetermined reference pixel data (Fig. 1).

Description

FIELD OF THE INVENTION



BACKGROUND OF THE INVENTION

the thickness profile of other types of intermediate layers that may be deposited on existing patterns for, by example
quality control and diagnosis purposes.

It is thus one object of this invention to accomplish the determination of the thickness profile of one or more layers 
or films in a rapid manner so as not to unduly impact the throughput of a semiconductor fabrication line. It is a further 
object of this invention to accomplish the determination of the thickness profile of one or more layers or films without 
requiring a priori knowledge of the types of underlying scattering and diffracting features, their geometry, or their loca- 
tions (both spatial and/or angular), and to also not require the use of precise positioning tables and the like. 

SUMMARY OF THE INVENTION 

The foregoing and other problems are overcome and the objects of the invention are realized by a method of 
optically cataloging the surface of a patterned wafer into regions having different optical scatter properties, so that
errors in the computation of thickness or optical constant maps can be reduced. A first level of screening separates
purely planar film system regions from those containing scattering and diffracting features. Further differentiation between
different planar film designs at different places on the wafer is accomplished by using different numerical spectral
libraries, and a subsequent determination of which spectral library provides a best fit (lowest merit function) over a
given area. Typically the measurements are made with prior knowledge of the type of film systems to expect, but not 
necessarily where individual areas are located or how they are aligned. 

Differentiating between areas which have different diffractive and scattering signatures (different circuit patterns, 
line directions, etc.) may be accomplished using multiple libraries which include a coherent coupling between the film 
surfaces and the diffracting and scattering structures. In this case library computation requires knowledge of the circuit 
spatial details and their orientation, as well as the optical properties (e.g., index of refraction at various wavelengths) 
of the materials. 

The teaching of this invention enables the measurement process for planar layers to occur in a rapid manner, and 
does not require that the wafer be precisely positioned under the optical measurement system. 

The teaching of this invention employs a high resolution, narrow field of view multispectral full aperture imaging 
system which incorporates an automatic system for determining which regions on a wafer contain planar areas and 
which regions contain scattering and diffracting regions. By example, two white light images, taken with oblique illumi- 
nation in orthogonal directions, are used to create a binary image mask which is used to prevent thickness computations 
from being carried out in areas which contain circuit features of any type (e.g., edges, rectangles, sub-micron features,
die edge lines, etc.). This technique speeds the measurement process and significantly reduces the number of erro- 
neous values resulting from scattering, diffraction, and defects, since these areas are automatically avoided. 

A layer thickness determination system in accordance with this invention is employed for detecting a thickness of 
at least one layer disposed over a surface of a wafer having one or more first regions characterized by circuit and other 
features, and one or more second regions characterized by an absence of circuit and other features. The system 
includes an optical system for collecting light reflecting from the at least one layer and the surface of the wafer. The 
optical system is preferably a telecentric optical system, and has an optical axis and a narrow cone of acceptance 
angles disposed about the optical axis. The system further includes a camera coupled to the optical system for obtaining 
an image from the collected light; a first light source, which includes filters, for illuminating the layer with substantially 
monochromatic light that is directed along the optical axis and within the cone of acceptance angles; and at least one
second light source for illuminating the layer with light that is directed off the optical axis and outside of the cone of 
acceptance angles. A data processor has an input coupled to an output of the camera for obtaining from the camera 
first pixel data corresponding to at least one first image obtained with light from the first light source and for obtaining 
second pixel data corresponding to at least one second image obtained with light from the at least one second light 
source. The data processor operates to generate an image mask from the second pixel data for distinguishing the first 
wafer regions from the second wafer regions, and further operates to detect a thickness of the at least one layer within
the second regions in accordance with the first pixel data and in accordance with predetermined reference pixel data. 

The first multispectral light source provides illumination with a first incidence angle on the layer, the first incidence 
angle being an angle that causes specularly reflected light to enter the cone of acceptance angles of the optical system. 
The second light source (which may be a part of the first light source) provides illumination with a second incidence angle
on the layer, the second incidence angle being an angle that causes specularly reflected light to not enter the cone of 
acceptance angles of the optical system. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The above set forth and other features of the invention are made more apparent in the ensuing Detailed Description
of the Invention when read in conjunction with the attached Drawings, wherein:

Fig. 1 is a simplified block diagram illustrating a film thickness measurement system in accordance with this invention;

Fig. 2 illustrates process steps in accordance with a prior art method of measuring film thickness for a film disposed
on a planar, unpatterned substrate;

Fig. 3 illustrates process steps in accordance with this invention for measuring film thickness for a film disposed
on a patterned substrate;

Figs. 4a-4e depict images of a patterned substrate having a film layer, and illustrate a method for deriving a mask
to select areas that avoid scattering and diffracting patterns;

Fig. 4f is an enlarged cross-sectional view, not to scale, that illustrates the use of a planar reference surface within
the field of view of the camera of Fig. 1;

Fig. 4g is an enlarged cross-sectional view that illustrates a portion of a wafer and two SiO2 films;

Fig. 5a illustrates a thickness map for a portion of a SiO2 layer on a patterned substrate without the use of the
mask generated in accordance with this invention;

Fig. 5b illustrates a thickness map for the portion of the SiO2 layer on the patterned substrate with the use of the
mask generated in accordance with this invention;

Fig. 6a is a histogram corresponding to the unmasked thickness map of Fig. 5a;

Fig. 6b is a histogram corresponding to the masked thickness map of Fig. 5b;

Fig. 7a is an exemplary 2.06 micron layer thickness map obtained by masking and thresholding in accordance
with this invention;

Fig. 7b is an exemplary 1.33 micron layer thickness map obtained by masking and thresholding in accordance
with this invention;

Fig. 8a is a merit function map corresponding to the unmasked thickness map of Fig. 5a;

Fig. 8b is a merit function map corresponding to the masked thickness map of Fig. 5b;

Fig. 9 is a logic flow diagram illustrating the use of a plurality of planar and diffracting libraries in accordance with
an aspect of this invention;

Fig. 10a is a diagram of a conventional imaging system;

Fig. 10b is a diagram of a telecentric optical system that is a presently preferred embodiment of an imaging system
for use with this invention;

Fig. 11a is a block diagram of a thickness determination system in accordance with a first embodiment of this
invention; and

Fig. 11b illustrates a filter wheel for use in a second embodiment of this invention.

DETAILED DESCRIPTION OF THE INVENTION

It is first noted that the teaching of this invention is applicable in general to any structured optical pattern, whether 
it be, by example, a patterned wafer, a liquid crystal display, or a biological sample. As such, and although the invention 
is described herein in the context of a semiconductor wafer fabrication application, the teaching of this invention is not 
to be construed to be limited in scope to only the determination of a film thickness upon a semiconductor wafer. The 
terms "film" and "layer" are used interchangeably herein, and are both intended to encompass a region comprised of 
a first material (e.g., SiO2) that is disposed upon or over a second material (e.g., Si), wherein the first material has at
least one optical characteristic (e.g., index of refraction) that is different than that of the second material.

Patterned wafers at first glance contain a bewildering number of features, patterns and sub-patterns having di- 
mensions that range in size from sub-micron to some hundreds of microns. However, most of the sub-patterns are 
extremely repetitive and contain a limited number of line segment angles. The inventor has recognized that regions 
on a wafer can be considered to fall into two broad categories.

A first category includes areas that contain film structures which consist of only planar layers. These regions typ- 
ically are the scribe alleys between dies and the areas between blocks of sub-micron features. Areas free of circuit 
structures may also occur during each step of the die fabrication process. The measurement of this type of region can 
be accomplished using multispectral reflectometry combined with a search of pre-computed libraries, as disclosed in 
the above-mentioned commonly assigned U.S. Patents 5,291,269, 5,293,214 and 5,333,049, which are each incorporated
by reference herein in their entireties.

A second category of wafer regions contains circuit details, such as sub-micron integrated circuit structures, metal
traces, die lines, capacitors, and also local defects caused by processing. This second category of wafer regions is 
more difficult to characterize because the combination of one or more thin film layers, that are coherently coupled to 
such microscopic patterns, alters the phase and amplitude of the reflected light. In addition, these spatial variations 
tend to scatter and diffract light in non-specular directions. Numerical libraries pre-computed for this type of region 
require a knowledge of the material optical properties, as well as a knowledge of the mask patterns used at each stage 
of manufacture. 

As employed herein the term 'coherent coupling' has the following meaning. If two optical surfaces have intensity 
transmissions of T1 and T2, then the combined transmission is simply T1 times T2. However, if the surfaces are coherently
coupled together, then the resultant transmission is no longer the simple product, but depends instead on the
amplitude and the phase of the transmission coefficient at each surface, as well as on the distance between the two 
surfaces. 
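
As a hedged illustration of this definition, the following Python sketch contrasts the simple product T1 times T2 with the transmission of two coherently coupled surfaces. The lossless-surface model, the real amplitude coefficients, the Airy (multiple-reflection) summation, and the function names are all assumptions introduced here for illustration; none of them is taken from this specification.

```python
import cmath
import math

def incoherent_transmission(T1, T2):
    """Intensity transmission of two uncoupled surfaces: the simple product."""
    return T1 * T2

def coherent_transmission(T1, T2, gap, wavelength, n_gap=1.0):
    """Intensity transmission of two coherently coupled, lossless surfaces.

    Assumption: real amplitude coefficients t = sqrt(T) and r = sqrt(1 - T),
    normal incidence, and a uniform gap of refractive index n_gap.
    Multiple reflections between the surfaces are summed (Airy formula).
    """
    t1, t2 = math.sqrt(T1), math.sqrt(T2)
    r1, r2 = math.sqrt(1.0 - T1), math.sqrt(1.0 - T2)
    delta = 2.0 * math.pi * n_gap * gap / wavelength  # one-way phase across the gap
    amplitude = (t1 * t2 * cmath.exp(1j * delta)
                 / (1.0 - r1 * r2 * cmath.exp(2j * delta)))
    return abs(amplitude) ** 2

# Same two surfaces, two different separations (units are arbitrary but equal).
product = incoherent_transmission(0.5, 0.5)
half_wave = coherent_transmission(0.5, 0.5, gap=0.25, wavelength=0.5)
quarter_wave = coherent_transmission(0.5, 0.5, gap=0.125, wavelength=0.5)
```

Under these assumptions the coupled result swings well above and well below the simple product as the separation changes, which is the dependence on amplitude, phase, and distance described above.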

A basic principle of this invention is illustrated in the optical system 10 shown in Fig. 1, where a surface 12 to be 
measured is viewed at normal incidence using an optical system 14 having a large F/number. As a result, the only light
which can reach a multi-pixel CCD camera 16 is that which leaves the surface 12 within a few degrees of the optical
axis 14a of the optical system 14, that is, light having angles within an acceptance cone of the optical system 14. It is
noted that non-normal incidence can also be employed, providing that the image plane of the camera 16 is tilted and 
the illumination sources are correctly positioned. 

The optical system 14 is preferably a full aperture telecentric system having a narrow cone of acceptance angles.
Referring briefly to Fig. 10a, there is illustrated a conventional imaging system used with a camera focal plane. A lens
is disposed above a surface to be imaged at the camera focal plane. In the conventional imaging system the central
rays (CR) are not perpendicular to the surface being imaged. As a result, different points in the field of view (FOV) of
the camera are not imaged in the same manner. The conventional imaging system is characterized by a working
distance (WD) measured in millimeters, a FOV (F/1) that is measured in tens of degrees, and a depth of field measured
in microns. 

In contradistinction, and referring to Fig. 10b, a telecentric imaging system is characterized by two lenses (L1 and
L2), and an aperture (AP) that is located one focal length from L2. In the telecentric system the chief central rays (CR)
are all parallel and perpendicular to the surface being imaged. The telecentric imaging system is characterized by a
working distance (WD) measured in centimeters, a FOV of approximately one degree (i.e., a narrow cone of
acceptance angles), and a depth of field measured in millimeters. 

Referring again to Fig. 1, a white (broadband) light source 18 and condensing lens 20 provide illumination of the
surface 12 under test. If the surface 12 under test is illuminated with light incident at an angle greater than half the
acceptance angle of the optical system 14, and if the surface 12 contains only planar layers, then none of the specularly
reflected light enters the optical system and the image intensity corresponds to the black level of the CCD camera 16. 
That is, no image is detected. This specularly reflected light is indicated generally as 18a. 

If, on the other hand, the surface 12 being viewed contains micron and sub-micron sized patterns of any type, then
some light (designated as 18b) will be deviated by diffraction and scattering into the narrow acceptance cone of the
optical system 14 and, hence, to the camera 16. The resulting detected pixel image thus includes bright areas corresponding
to scattering and diffracting features, such as edges, that are embedded in the patterned regions.

In practice the optical system 14 shown in Fig. 1 also includes an on-axis filtered light source 22 and a beamsplitter
(not shown in Fig. 1) placed between the surface 12 under test and the camera 16. This permits the recording and 
digitization of multispectral images which are used to measure the optical spectra at each pixel location, as is described 
in detail below. 

Diffraction occurs when light illuminates a surface where the complex refractive index changes abruptly over the
surface. The amount of light diffracted in an optical system is approximately proportional to the total length of an illuminated
edge, multiplied by both the wavelength of the light and the intensity of the light beam (watts/cm). In conventional
optical systems this is typically an extremely small amount since only a few apertures are illuminated. The opposite
is true, however, in the case of patterned wafers, where the total length of the lines (corresponding to circuit
features) which can diffract light is enormous. The large number of edge features results in large amounts of incident
light being redirected in non-specular directions.

If images of the patterned wafer surface are recorded while being illuminated from at least one, and preferably two
or more different directions using at least one and possibly a large range of incident angles, wavelengths and polarizations,
then a mask can be created from these digitized images. The mask is then used to differentiate those regions
of the wafer surface that are free of scattering and diffracting features from those regions that are not. In accordance
with this invention, the application of the mask identifies those regions where specular reflection can be used to detect
the thickness of the planar layer(s), as in the aforementioned U.S. Patent 5,333,049. In the simplest case the mask is
a two state (binary) mask, but in general multiple (three or more) level masks can be obtained depending upon the 
optical effects to be masked. 

In the case of the two state (i.e., binary) mask, thickness computations using spectral libraries precomputed for 
planar layers are only valid in those regions of the mask where the diffracted light level is zero, i.e., the black regions. 
Planar spectral libraries of a type employed in the above-mentioned U.S. Patents 5,291,269, 5,293,214 and 5,333,049,
which have been incorporated by reference herein in their entireties, are not in general valid in the bright regions since 
the spectral signature of such regions is strongly influenced by the diffractive losses. 

In consequence, a simple image mask, for example a binary mask, is used to reduce the number of points to be 
computed in order to determine a thickness profile of at least one layer, such as a CMP layer 12a, that overlies a 
patterned surface of a wafer 13. This results in increased processing speed and a reduction in errors in the resultant 
thickness map(s), since the diffracting areas are automatically screened out and eliminated from the film thickness
determination. 

Reference is now made to Fig. 11a for showing in greater detail one presently preferred embodiment of this invention.
A primary white light source (LS1) provides a beam to a condensing lens (L1) which focusses the beam onto
a filter wheel 24 having a plurality of different filters 24a, each of which passes a different wavelength. Filtered light
emanating from the filter wheel 24 is collimated by a second lens (L2) and is directed to a 50/50 beamsplitter 26. Light
reflecting from the beamsplitter 26 is directed along the optical axis 14a that is normal to the surface of the wafer 13.
Light that is specularly reflected from the surface of the wafer 13 is collected by the telecentric optical system 14 and
is focussed at the image plane of the CCD camera 16, as was described previously with regard to Fig. 10b. The output
of the CCD camera 16 is provided to a conventional frame grabber 16a which stores, for each position of the filter
wheel 24, the resulting image or light signature. A complete frame comprised of pixel intensity values is read out by a
processor 28a which stores the frame in a memory 28b. The processor 28a controls the position of the filter wheel 24
so as to obtain N images of the wafer, wherein each image corresponds to illumination of the wafer with light of a
predetermined wavelength as selected by the particular filter that is interposed in the path of the white light beam from
source LS1. At least one predetermined spectral library is stored in memory 28c, which is subsequently accessed and
compared to the stored pixel values in the memory 28b.

The spectral library 28c contains a description of a set of reflectance curves for a range of parameter values such
as film thickness. For example, and referring to Fig. 4g, if there are two films or layers deposited onto a silicon wafer
the films will have optical properties $(n_1, k_1, t_1)$ and $(n_2, k_2, t_2)$, where $t_1$ and $t_2$ are the thicknesses of films 1 and 2,
respectively, $n_1$ and $n_2$ are the refractive indices of films 1 and 2, respectively, and $k_1$ and $k_2$ are the absorption constants
of films 1 and 2, respectively. The substrate also has a refractive index $n_s$ and an absorption constant $k_s$. The index of
refraction and absorption values are wavelength dependent, and are actually given by $n(\lambda)$, $k(\lambda)$ for all films and the
substrate.

The reflectance (R) of light at wavelength $\lambda$ for film thicknesses $t_1$ and $t_2$ for the two materials can be written as:

$$R = R(\lambda, n_1, k_1, t_1, n_2, k_2, t_2, n_s, k_s)$$

If it is desired to measure the second film thickness $t_2$, then a series of reflectances $R_1 \ldots R_m$ can be pre-computed
at wavelengths $\lambda_1 \ldots \lambda_m$ for each value of the unknown thickness parameter $t_2$.

All these spectra are sampled data sets containing m values. The library 28c therefore contains all possible reflectance
(or transmission) spectra (or some normalized version thereof) which are expected during a single measurement.

When a wafer is measured there is obtained a set of reflectance values (P) for wavelengths $\lambda_1, \lambda_2, \ldots, \lambda_m$. For those
reflectance values at the same m wavelengths in the library 28c, the processor 28a determines which spectral pattern
in the library 28c most closely matches the measured reflectance values. The thickness associated with a selected
spectral pattern is thus correlated with the thickness of the film being measured.
In the simplest calculation a merit function $M(t_2)$ is derived which is a least squares function formed from the
measured spectrum and one of the library spectra:

$$M(t_2) = (P_1 - R_1(t_2))^2 + (P_2 - R_2(t_2))^2 + \cdots + (P_m - R_m(t_2))^2$$

Clearly, if the measured reflectance values $P_1 \ldots P_m$ were all exactly the same as the precomputed reflectances
$R_1(t_2) \ldots R_m(t_2)$ for some unknown $t_2$, then M will be zero and a perfect match will have been found. In practice noise in
the measurement system causes M to rarely equal zero. As a result, the goal is to determine a merit function $M(t_2)$
having a minimum value. This is accomplished by finding the sum in the foregoing equation for all values of $t_2$ and
then choosing the $t_2$ value which gives the minimum merit function $M(t_2)$. The selected $t_2$ value is thus taken to be the
thickness of the layer 2 in Fig. 4g.
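
The library search just described can be sketched in a few lines of Python. This is a minimal sketch only: the dictionary layout of the library, the function names, and the numerical values are hypothetical and chosen purely to illustrate the least squares merit function and the selection of its minimum.

```python
def merit(measured, library_spectrum):
    """Least squares merit function M(t2) between two sampled spectra."""
    return sum((p - r) ** 2 for p, r in zip(measured, library_spectrum))

def best_fit_thickness(measured, library):
    """Return (thickness, merit) for the library entry with the lowest merit.

    `library` maps each candidate thickness t2 to its precomputed
    reflectances R_1(t2) .. R_m(t2), sampled at the measurement wavelengths.
    """
    return min(
        ((t2, merit(measured, spectrum)) for t2, spectrum in library.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical three-wavelength library; all values are invented for illustration.
library = {
    1.30: [0.30, 0.42, 0.35],
    1.33: [0.28, 0.45, 0.31],
    1.36: [0.25, 0.47, 0.29],
}
measured = [0.27, 0.46, 0.30]  # P_1 .. P_m for one pixel

thickness, m = best_fit_thickness(measured, library)
```

Because of measurement noise the returned merit is small but non-zero, and the thickness attached to the closest library spectrum is reported for the pixel.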

Referring again to Fig. 11a, and in accordance with this invention, at least two additional light sources LS2 and
LS3 are provided for illuminating the wafer at angles that are off of the optical axis (OA). The wafer images obtained
with these two additional light sources are employed to detect those regions of the wafer surface that are subject to
non-specular reflections, i.e., those regions having features that scatter and diffract the illumination. The scattering
and diffracting images are also stored within the memory 28b, and are processed as described below to generate a
binary mask defining those regions of the wafer wherein the data stored in the spectral library 28c may give incorrect
thickness results.

The two oblique light sources LS2 and LS3 may be replaced by a single source that surrounds the wafer 13. Alternatively,
and as is illustrated in Fig. 11b, the filter wheel 24 may have two positions 25a and 25b wherein a feature 25c,
such as a centrally located opaque region, causes the incident light to be diffracted away from the axis normal to the
surface of the filter wheel 24. This can be seen by contrasting the normal ray A with the diffracted ray B. The result is
that the ray B will strike the surface of the wafer 13 at an angle that diverges from the optical axis 14a, which is the
desired result. By providing two regions 25a and 25b that bend the light in two different directions, it is possible to
illuminate the wafer 13 from two different directions, as will be described below with respect to Figs. 4b, 4c and 4d.
Representative dimensions for the regions 25a and 25b are a diameter of one inch, and a width of 0.75 inch for the
centrally located opaque regions 25c.

Fig. 2 illustrates the computation flow for a conventional multi-spectral case. In Block A, N images are acquired at
N different wavelengths. In Block B a computation rectangle is defined from the N images. In Block C a spectral curve
is obtained for each pixel within the computational rectangle. The spectral curve indicates the amount of light reaching
the camera for each of the N wavelengths. At Block D a pre-calculated spectral library (Block E) is accessed to obtain
a best fit film thickness. This is a recursive process, with control flowing back to Block C until a best fit curve is obtained
for all pixels in the computational rectangle. At Block F a thickness (t) is output from the processor 28a for each (x,y)
pixel position within the computational rectangle.
Fig. 3 illustrates the computation flow in accordance with this invention, wherein only pixels in mask-specified
planar layer regions are processed. In Block A, N images are acquired at N different wavelengths. In Block B at least
two scatter/diffraction images are acquired. In Block C the processor 28a generates a binary mask based on the
scatter/diffraction images that are acquired in Block B. In Block D a computation rectangle is defined based on the mask
obtained in Block C. In Block E a determination is made, for each pixel in the computation rectangle, whether the mask
function (M) is equal to zero or one. If M=0 the pixel is rejected, while if M=1 the pixel is selected and the method
continues to Block F where a spectral curve is obtained for each selected pixel within the computational rectangle. As
before, the spectral curve indicates the amount of light reaching the camera for each of the N wavelengths. At Block
G a pre-calculated spectral library (Block I) is accessed to obtain a best fit film thickness. At Block H a thickness (t) is
output for each (x,y) selected pixel position within the computational rectangle. In accordance with this invention the
selected pixels are those that correspond to a planar layer region of the surface under test that is free of underlying
scattering and diffracting features.
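
The masked flow of Fig. 3 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the processor 28a implementation: images and mask are represented as nested lists, the library is a hypothetical dictionary keyed by candidate thickness, rejected pixels are reported as `None`, and all numerical values are invented.

```python
def thickness_map(images, mask, library):
    """Compute a best fit thickness only at mask-selected pixels.

    images : list of N two-dimensional lists, one image per wavelength
    mask   : two-dimensional list, 1 = planar (selected), 0 = rejected
    library: dict mapping candidate thickness -> N precomputed reflectances
    """
    rows, cols = len(mask), len(mask[0])
    result = [[None] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] == 0:
                continue  # scattering/diffracting pixel: no library search
            spectrum = [image[y][x] for image in images]  # spectral curve
            # Choose the candidate thickness with the lowest least squares merit.
            result[y][x] = min(
                library,
                key=lambda t: sum(
                    (p - r) ** 2 for p, r in zip(spectrum, library[t])
                ),
            )
    return result

# Two wavelengths, a 1 x 2 pixel image; the second pixel is masked out.
images = [[[0.30, 0.90]], [[0.42, 0.90]]]
mask = [[1, 0]]
library = {1.0: [0.30, 0.42], 2.0: [0.10, 0.20]}

result = thickness_map(images, mask, library)
```

Skipping the library search for masked pixels is what yields both the speed gain and the reduction in erroneous thickness values.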

Figs. 4a-4e illustrate the principle of the mask generation method for a patterned wafer which has been coated
with at least one layer of silicon dioxide (SiO2) as part of a planarization process. The images shown in Figs. 4a-4e
are approximately 1/8th of the total image collected by the frame grabber 16a that is associated with the CCD camera
16 of Fig. 1.

It can be seen that the substantially monochromatic image (one of many taken during a measurement) shown in
Fig. 4a does not provide any indication concerning which areas of the wafer contain scattering/diffracting regions and
which areas contain only planar (specularly reflecting) layers. The images shown in Figs. 4b and 4c are scatter images
taken using the off-axis light source 18 at two different illumination directions (front and side). Essentially these two
images are "dark field" images which are combined, pixel-by-pixel, to form a composite scattering and diffracting image
that is shown in Fig. 4d. The two scatter images in this case were taken with the light sources 18 in approximately
orthogonal directions. This is evident from the different orientations of the line pair images on the left side of the images
in Figs. 4b and 4c.

Referring to Fig. 4g, these line pair images correspond to the horizontal and vertical edges of square apertures
that were etched into a SiO2 CMP layer before the entire wafer was coated with a second layer of SiO2. Off-axis
illumination that strikes the edges is scattered and diffracted into the acceptance cone of the optical system 14, as indicated
by rays C and D, and is detected by the camera 16. Those rays that strike only planar film regions, without underlying
substrate circuit features, are specularly reflected (rays A and B), and do not enter the acceptance cone of the optical
system 14. That is, for the specularly reflected rays A and B the angle of reflection is approximately equal to the angle
of incidence, and the angle of incidence is predetermined so that the reflected rays do not enter the acceptance cone
of the optical system 14. By example, if the acceptance cone of the imaging system 14 includes only rays that diverge
by up to 1° from the optical axis, then an angle of incidence that is 2° from the optical axis (or 88° with respect to the
surface of the wafer) is sufficient to ensure that any specularly reflected rays will not reach the focal plane of the CCD
camera 16.
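
The angular condition in this example reduces to a single comparison, sketched below in Python. The function names are hypothetical, and the sketch assumes an ideal planar reflector and an optical axis normal to the wafer surface.

```python
def specular_ray_rejected(incidence_deg, acceptance_half_angle_deg):
    """True if the specular reflection from a planar region misses the optics.

    Angles are measured from the optical axis. The specular ray leaves at the
    incidence angle, so it falls outside the acceptance cone whenever that
    angle exceeds the half-angle of the cone.
    """
    return abs(incidence_deg) > acceptance_half_angle_deg

def angle_from_surface_deg(incidence_deg):
    """Convert an angle measured from the optical axis to one from the surface."""
    return 90.0 - incidence_deg

# The numerical example from the text: a 1 degree cone and 2 degree illumination.
rejected = specular_ray_rejected(2.0, 1.0)
from_surface = angle_from_surface_deg(2.0)
```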

As a result, the edges associated with rays C and D contribute to the formation of "bright" areas in the CCD camera
image while the planar regions associated with rays A and B do not contribute to the image and are thus "dark".

The line pairs of Figs. 4b and 4c combine to outline the inverted islands in the composite image of Fig. 4d, which
also indicates, by the presence of the dark central region, that there are no circuit details inside these squares.

In general, this technique can employ a single source that is moved relative to the wafer, or can use multiple
sources disposed at different locations, or can employ a ring or annular source that simultaneously applies off-axis
illumination from all directions relative to the wafer. The source or sources may provide various wavelengths, polarizations
and incidence angles to enhance the light signature recording process, and may be used to code various image
regions on the wafer surface. By example, and referring to Fig. 1, a polarizer plate 19 can be interposed within the
beam of the off-axis light source 18. By rotating the polarizer plate 19 various polarization states can be introduced
into the off-axis beam.

As is seen in the detail of Fig. 4f, the camera field of view also contains a planar silicon reflecting surface 30 which
provides a "black level" planar reference surface. This silicon structure may be thought of as a "picture frame" that
surrounds the image of the silicon wafer. The off-axis illumination that strikes the reference surface (RS) experiences
specular reflection (ray A), just as does the ray B that strikes an unpatterned region of the silicon wafer, and is thus
not detected by the camera 16. In contradistinction, the off-axis illumination that strikes a patterned region of the
wafer experiences scattering and diffraction (rays C), and a portion of the scattered and diffracted illumination is
detectable by the camera 16. The Si reference surface RS can be seen in the left-most region in all the images shown in
Figs. 4a-4e.

In accordance with an aspect of this invention the electronic intensity level in the area corresponding to the
reference surface RS is used to determine a level at which to threshold the images of Figs. 4b and 4c, i.e., any pixels
above the silicon reference level are set to 0.0 (black) and those below the silicon reference level are set to 1.0
(white). This thresholding of the image generates the binary mask 32 that is illustrated in Fig. 4e. In Fig. 4e those
areas that appear black contain scattering and diffracting features, while those areas that appear white are free of
such features and are associated only with planar film systems wherein accurate film thickness determinations can be
made using precomputed library functions. In some cases the edges of the resulting mask may be irregular. However, the
irregularities can be removed by the use of known image processing techniques, such as filters, to remove isolated
pixels.
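
The thresholding step can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the image is a list of rows of intensity values, that the reference surface occupies known columns, and that a small noise margin is added to the reference level. The noise margin, the 4-neighbour isolated-pixel filter, and all names and intensity values are assumptions introduced here.

```python
def reference_level(image, ref_columns):
    """Mean intensity over the columns occupied by the reference surface."""
    values = [row[c] for row in image for c in ref_columns]
    return sum(values) / len(values)


def build_mask(image, level, margin=0.0):
    """Pixels above the reference level (plus margin) become 0.0 (black), others 1.0 (white)."""
    return [[0.0 if pixel > level + margin else 1.0 for pixel in row] for row in image]


def remove_isolated_pixels(mask):
    """Flip any pixel whose existing 4-neighbours all hold the opposite value."""
    rows, cols = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                mask[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            if neighbours and all(n != mask[r][c] for n in neighbours):
                cleaned[r][c] = neighbours[0]
    return cleaned


# Illustrative off-axis image: column 0 is the reference surface, columns 3-4
# scatter strongly, and pixel (1, 1) is a single bright outlier.
image = [
    [10, 12, 11, 90, 95],
    [10, 80, 11, 92, 94],
    [10, 12, 11, 91, 93],
]
level = reference_level(image, ref_columns=[0])
raw_mask = build_mask(image, level, margin=5.0)
clean_mask = remove_isolated_pixels(raw_mask)
```

Here the lone outlier at row 1, column 1 is marked black in `raw_mask` and restored to white in `clean_mask`, while the two scattering columns remain black.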

The collection of the images of Figs. 4b and 4c is preferably made during the acquisition of all the multispectral
images, but before any thickness calculations are performed.

It should be noted that the technique described thus far does not require any special wafer positioning or rotational
alignment, other than that required to determine where the map is being measured for archival purposes.

An examination of the various regions of the wafer image shown in Fig. 4a with a high power microscope capable
of resolving submicron lines revealed that all of the areas which show scatter and diffraction contain circuit features.
Most of the rectangular regions contain vertical line patterns. The empty rectangles on the left side (see Fig. 4g) do
not contain features, and so with the exception of the surrounding edge regions, these squares contain only planar
films as indicated by the corresponding bright portions of the mask of Fig. 4e.

Thickness maps were generated for the entire 600x64 pixel image of Fig. 4a using a numerical spectral library
28a which contained spectra of a single SiO2 layer (0 to 4.0 microns) on silicon, since it was known that the wafer had
been coated with such a layer. Figs. 5a and 5b show a comparison of the thickness maps (32x300 points) obtained
both without (Fig. 5a) and with (Fig. 5b) the use of the mask of Fig. 4e. In this case thicknesses were determined for
each of the 38,400 pixels and the mask was then used as a multiplier. It can be seen that the unmasked map of Fig.
5a contains numerous incorrect values, some of which are due to the edge of the silicon reference surface RS but
most of which are due to scattering and diffraction effects from the patterned wafer surface. The masked map of Fig.
5b contains far fewer points, since the area under consideration includes large areas containing circuit details which
were excluded from the thickness determination.
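
Using the mask as a multiplier can be sketched as below. This is a hedged illustration that assumes the thickness map and the mask are equally sized lists of rows, with mask values of 1.0 (planar) and 0.0 (scattering or diffracting). The function names and the sample thickness values are hypothetical.

```python
def apply_mask(thickness_map, mask):
    """Multiply each thickness value by the corresponding mask value (1.0 or 0.0)."""
    return [
        [t * m for t, m in zip(t_row, m_row)]
        for t_row, m_row in zip(thickness_map, mask)
    ]


def retained_fraction(mask):
    """Fraction of pixels that the mask allows into the thickness determination."""
    total = sum(len(row) for row in mask)
    kept = sum(1 for row in mask for m in row if m == 1.0)
    return kept / total


# Illustrative values in microns; 3.9 and 0.3 stand in for incorrect results
# computed over patterned regions.
thickness_map = [[1.8, 2.2, 3.9], [1.7, 0.3, 2.1]]
mask = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
masked_map = apply_mask(thickness_map, mask)
fraction_kept = retained_fraction(mask)
```

One consequence of a purely multiplicative mask is that excluded pixels become 0.0, so any later statistics need the mask itself to tell an excluded pixel from a genuinely zero thickness.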

However, the thickness values of Fig. 5b can be shown to fall into two main bands, as is indicated when contrasting
the histogram of Fig. 6a with that of Fig. 6b. In the unmasked case of Fig. 6a, 5% to 6% of the pixels give an incorrect
answer, whereas the use of the mask 32 of Fig. 4e reduces the number of incorrect pixel values to less than 0.1%.
The width of the histogram peaks also indicates the degree of film thickness uniformity over the field of view that was
used (5mm x 0.5mm).

It is apparent from the histogram data that two film thickness values are predominant, but these values are difficult
to discern from a three dimensional plot. Figs. 7a and 7b show these two major components of the masked thickness
map obtained by thresholding the data into two ranges (0.0 to 1.9 microns) and (1.9 to 2.5 microns), and further
illustrate the clarity that is introduced when the masking technique is used. In Fig. 7a the rectangularly shaped regions
on the left correspond to the top of the apertures shown in Fig. 4g, while in Fig. 7b the rectangularly shaped regions
on the left correspond to the film layer within the bottom of each of the apertures in Fig. 4g.
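
The separation of the masked thickness map into two ranges can be sketched as follows. The range limits are those stated above. Treating the lower limit as inclusive and the upper limit as exclusive, and marking excluded pixels with `None`, are assumptions made for this illustration, as are the function name and sample values.

```python
def split_into_bands(thickness_map, mask, bands=((0.0, 1.9), (1.9, 2.5))):
    """Return one map per band; pixels outside the band or masked out are None."""
    result = []
    for low, high in bands:
        band_map = [
            [t if m == 1.0 and low <= t < high else None for t, m in zip(t_row, m_row)]
            for t_row, m_row in zip(thickness_map, mask)
        ]
        result.append(band_map)
    return result


# Illustrative thickness values in microns with a binary mask.
thickness_map = [[1.8, 2.2, 3.9], [1.7, 0.3, 2.1]]
mask = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
lower_band, upper_band = split_into_bands(thickness_map, mask)
```

Each returned map then plays the role of one of the two component maps, with every allowed pixel appearing in at most one band.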

Although the masking technique does effectively catalog the patterned wafer surface into planar and diffracting
regions, it is still possible for there to be several different thin film stacks present within a given area of a wafer.
If it is assumed that the two different film thicknesses indicated by the measurement described earlier were actually
caused by depositing the SiO2 layer onto two structures, in one case bare silicon and the other case a titanium nitride
film on bare silicon, then the thickness determining algorithm would have computed thickness values for both sets of
pixels, one set being correct and the other incorrect. Fortunately, in cases like this the merit function, which
describes the goodness of fit and is automatically calculated during the thickness determination, is very useful in
discriminating against different film stack designs. That is, different film systems over different regions exhibit
merit functions having different average values over the different thin film systems. The correct thicknesses
correspond to the lowest average merit function, since a "perfect fit" would give zero merit function values.

The merit function maps corresponding to the measurement previously described are shown in Figs. 8a and 8b,
where the vertical scale is in percent, representing the least squares difference in reflectance between the measured
and pre-computed spectra over all the wavelengths used in the measurement. Fig. 8a shows the unmasked merit map
with values ranging up to 1%, and clearly shows the different average merit function values at different places on the
wafer. In contrast, Fig. 8b shows the merit function only over the regions allowed by the mask, and clearly shows that
the average values of the merit function are approximately the same over regions where the computed film thicknesses
fall into two bands. This is a strong indication that the film materials and film stacks are identical, except for
thickness variations in the two regions.
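
A library match with an accompanying merit value can be sketched as follows. The exact form of the merit function is not given in the text beyond a least squares difference in reflectance expressed in percent, so the root-mean-square form used here is an assumption. The library contents are placeholder numbers rather than computed SiO2-on-silicon reflectances, and all names are hypothetical.

```python
import math


def merit(measured, reference):
    """Assumed merit: RMS reflectance difference over all wavelengths, in percent."""
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


def best_match(measured, library):
    """Return (thickness, merit) for the library spectrum with the lowest merit."""
    candidates = ((t, merit(measured, spectrum)) for t, spectrum in library.items())
    return min(candidates, key=lambda pair: pair[1])


# Placeholder library: thickness in microns -> reflectance at three wavelengths.
library = {
    1.8: [0.30, 0.25, 0.35],
    2.2: [0.20, 0.40, 0.30],
}
measured_pixel = [0.21, 0.39, 0.30]
thickness, fit = best_match(measured_pixel, library)
perfect_fit = merit(library[1.8], library[1.8])
```

Averaging the returned merit values over each region gives the kind of per-region comparison described above, with a lower average indicating a better-matched film stack model.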

Further in accordance with this invention, a method for creating a catalog for the different regions expected to be
found on a patterned wafer or, by example, an LCD flat screen display, is illustrated in Fig. 9. A main division between
planar and scattering/diffracting regions is augmented by using goodness of fit maps to distinguish between different
planar stacks or different pattern/film stack combinations. It should be noted that the two libraries (planar and
diffracting) are not independent, and that the measurement data obtained from planar layer regions can also be used to
complement analysis of the diffracting regions. The predominant errors in precomputing the numerical libraries for
planar regions are caused by uncertainties in the optical constants for the different layers. Except for the well known
list of certain materials, namely silicon, steam grown SiO2, air and water, most materials used in semiconductor
manufacture exhibit optical properties which depend upon the deposition process used.

Computing the numerical libraries for the various diffracting regions found on a wafer requires that the coherent
coupling interactions between the patterned layers be included. This in turn requires a priori knowledge of the
[illegible] pattern, and these [illegible] can be treated as arrays of shapes.

In Fig. 9, having established the mask for a given wafer or portion of a wafer as described above (Block A), those
regions designated as planar regions are processed with the established planar libraries for the various types of film
stacks. The regions designated as diffracting (i.e., regions having circuit features) are processed with the
established diffracting libraries for discrete patterned wafer regions. In this manner a thickness profile that
incorporates both planar and patterned wafer regions is obtained.
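
The routing of each region to the appropriate library can be sketched as below. This is only an outline of the dispatch step. The two matcher callables are stand-ins supplied by the caller, since the structure of the planar and diffracting libraries is not detailed here, and every name is hypothetical.

```python
def thickness_profile(spectra, mask, match_planar, match_diffracting):
    """Apply the planar matcher where mask == 1.0, otherwise the diffracting matcher."""
    return [
        [
            match_planar(spectrum) if m == 1.0 else match_diffracting(spectrum)
            for spectrum, m in zip(s_row, m_row)
        ]
        for s_row, m_row in zip(spectra, mask)
    ]


def planar_stub(spectrum):
    """Stand-in for a planar library lookup; only reports which library was used."""
    return ("planar", len(spectrum))


def diffracting_stub(spectrum):
    """Stand-in for a diffracting library lookup; only reports which library was used."""
    return ("diffracting", len(spectrum))


# One row of two pixels, each with a two-wavelength spectrum.
spectra = [[[0.2, 0.4], [0.3, 0.1]]]
mask = [[1.0, 0.0]]
profile = thickness_profile(spectra, mask, planar_stub, diffracting_stub)
```

In a real system each stub would be replaced by a lookup against the corresponding precomputed library, so that the combined result covers both planar and patterned regions.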

The thickness measurements described herein are for what could be termed an uncooperative wafer, i.e., no prior
knowledge was available as to the circuit patterns other than the fact that the entire surface had been coated with a
planarizing layer of SiO2. In practice, prior knowledge of the various film structures and possibly the local geometric
patterns of circuit structures can be accurately determined prior to making the thickness determination.

Although described in the context of presently preferred embodiments of this invention, it should be understood
that a number of modifications can be made to these embodiments, and that these modifications will fall within the
scope of the teaching of the invention. By example, if a GaAs wafer is being imaged it may be preferable to also employ
GaAs as the reference surface (RS) material for the pixel thresholding operation. Also by example, the filter wheel 24
can be replaced with a moveable grating or a prism for providing illumination with multiple wavelengths.

Thus, while the invention has been particularly shown and described with respect to preferred embodiments thereof,
it will be understood by those skilled in the art that changes in form and details may be made therein without departing
from the scope and spirit of the invention.



Claims 



1. An apparatus for detecting a thickness (t) of at least one layer (12a) disposed over a surface of a wafer (13) having
one or more first regions provided with circuit and other features, and one or more second regions showing an
absence of circuit and other features, comprising:

- an optical system (14) for collecting light reflecting from said at least one layer (12a) and said surface of said
wafer (13), said optical system (14) having an optical axis (OA; 14a) and a cone of acceptance angles disposed
about said optical axis (OA; 14a);

- a camera (16) coupled to said optical system (14) for obtaining an image from the collected light;

- a first light source (22; LS1) for illuminating said layer (12a) with light; and

- a data processor (28a) having an input coupled to an output of said camera (16),

characterized by

- said first light source (22; LS1) emitting light that is directed along said optical axis (OA; 14a) and within said
cone of acceptance angles;

- at least one second light source (18; LS2, LS3) for illuminating said layer (12a) with light that is directed off
said optical axis (OA; 14a) and outside of said cone of acceptance angles; and

- said data processor (28a) obtaining from said camera (16) first pixel data corresponding to at least one first
image obtained with light from said first light source (22; LS1), and obtaining second pixel data corresponding
to at least one second image obtained with light from said at least one second light source (18; LS2, LS3),
said data processor (28a) including means (28b, 28c) for generating an image mask from said second pixel
data for distinguishing said first wafer regions from said second wafer regions, and further including means
for detecting a thickness (t) of said at least one layer (12a) within said second regions in accordance with said
first pixel data and in accordance with predetermined reference pixel data.

2. The apparatus of claim 1, characterized in that said optical system (14) is comprised of a telecentric optical system.

3. The apparatus of claim 1 or 2, characterized in that said first light source (22; LS1) provides illumination with a
first incidence angle on said layer (12a), the first incidence angle being an angle that causes specularly reflected
light to enter said cone of acceptance angles of said optical system (14), and that said second light source (18;
LS2, LS3) provides illumination with a second incidence angle on said layer (12a), the second incidence angle
being an angle that causes specularly reflected light to not enter said cone of acceptance angles of said optical
system (14).

4. The apparatus of any of claims 1 - 3, characterized in that said first light source (22; LS1) includes means (24) for
sequentially illuminating said layer (12a) with light having different predetermined wavelengths, that said camera
(16) obtains a plurality of first images, individual ones of the plurality of first images being obtained with light having
one of the predetermined wavelengths, that said data processor (28a) includes means (28c), responsive to each
of said plurality of first images, for comparing associated first pixel data values corresponding to one or more of
said second regions with individual ones of a plurality of sets of predetermined image pixel values, each of the
plurality of sets corresponding to a different layer thickness (t), and that said data processor (28a) further includes
means for selecting as a layer thickness value (t) a thickness associated with a set that gives a best match with
the first pixel data values.

5. The apparatus of any of claims 1 - 4, characterized in that said at least one second light source is comprised of
first and second second light sources (LS2, LS3) that are disposed for illuminating said layer (12a) from different 
directions. 

6. The apparatus of any of claims 1 - 5, characterized by means (19) for varying a polarization state of said second
light source (18).

7. The apparatus of claim 1, characterized by a reference surface (RS; 30) that is disposed within a field of view of
said camera (16), said reference surface (RS; 30) being oriented with respect to said surface of said wafer (13)
for specularly reflecting light from said second light source (18; LS2, LS3), and wherein said image mask generating
means is responsive to second pixel data corresponding to an image of said reference surface (RS; 30) for
thresholding said second pixel data into first image mask regions corresponding to said first wafer regions and second
image mask regions corresponding to said second wafer regions.

8. A method for determining a thickness (t) of a layer (12a) disposed over a surface of a substrate (13), comprising
the steps of:

- illuminating the layer (12a) with light having a first incidence angle;

- obtaining at least one first image of the layer (12a) with light reflecting from the layer (12a) and the substrate
(13);

- illuminating the layer (12a) with light having a second incidence angle;

- obtaining at least one second image of the layer (12a) with light reflecting from the layer (12a) and the substrate
(13), the reflected light being primarily due to a presence of features within the layer (12a) or substrate (13)
that scatters and/or diffracts the light having the second incidence angle;

- determining from the at least one second image a location of one or more regions of the layer (12a) or substrate
(13) having the features and a location of one or more regions of the layer (12a) or substrate (13) not having
the features; and

- determining, in accordance with the at least one first image, a thickness (t) of the layer (12a) only within the
one or more regions not having the features.

9. The method of claim 8, characterized in that the first incidence angle is an angle that causes specularly reflected
light to enter an acceptance cone of an optical system (14) used to obtain the first image, and that the second
incidence angle is an angle that causes specularly reflected light to not enter the acceptance cone of the optical
system (14) used to obtain the second image.

10. The method of claim 8, characterized in that 

- the step of illuminating the layer (12a) with light having a first incidence angle includes the sub-step of
sequentially illuminating the layer (12a) with light having different predetermined wavelengths;

- that the step of obtaining at least one first image of the layer includes the sub-step of obtaining a plurality of
first images, individual ones of the plurality of first images being obtained with light having one of the
predetermined wavelengths; and

- that the step of determining includes the sub-steps of:

-- for each of the plurality of first images, comparing image pixel values corresponding to the one or more
regions not having the features with individual ones of a plurality of sets of predetermined image pixel
values, each of the plurality of sets corresponding to a different layer thickness; and

-- selecting as a layer thickness value (t) a thickness associated with a set that gives a best match with the
image pixel values.